    Sheet-Metal Production Scheduling Using AlphaGo Zero

    This work investigates the applicability of a reinforcement learning (RL) approach, specifically AlphaGo Zero (AZ), to optimizing sheet-metal (SM) production schedules with respect to tardiness and material waste. SM production scheduling is a complex job shop scheduling problem (JSSP) with dynamic operation times, routing flexibility, and supplementary constraints. SM production systems can process a large number of highly heterogeneous jobs simultaneously. While very large relative to the JSSP literature, the SM-JSSP instances investigated in this work are small relative to SM production reality. Given the high dimensionality of the SM-JSSP, computing an optimal schedule is not tractable, and simple heuristics often deliver poor results. We use AZ to selectively search the solution space. To this end, a single-player AZ version is pretrained using supervised learning on schedules generated by a heuristic, fine-tuned using RL, and evaluated against a heuristic baseline and Monte Carlo Tree Search. We show that AZ outperforms the other approaches. The work's scientific contribution is twofold: on the one hand, a novel scheduling problem is formalized so that it can be tackled with RL approaches; on the other hand, it is demonstrated that AZ can be successfully modified to solve the problem at hand, opening a new line of research into real-world applications of AZ.
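
    At the heart of an AlphaGo Zero-style search is the PUCT rule, which balances the network's policy prior against values observed during tree search. A minimal single-player sketch of that selection step follows (class names, the exploration constant, and the toy actions are illustrative assumptions, not the paper's implementation):

        import math

        # Minimal single-player PUCT selection as in AlphaGo Zero-style search.
        # Names and the exploration constant are illustrative assumptions.

        class Node:
            def __init__(self, prior):
                self.prior = prior        # P(s, a) from the policy network
                self.visits = 0           # N(s, a)
                self.value_sum = 0.0      # W(s, a), e.g. reward for low tardiness/waste
                self.children = {}        # scheduling action -> Node

            def q(self):
                return self.value_sum / self.visits if self.visits else 0.0

        def select_child(node, c_puct=1.5):
            """Pick the child maximizing Q(s, a) + U(s, a), the PUCT rule."""
            n_parent = sum(c.visits for c in node.children.values())
            def puct(child):
                u = c_puct * child.prior * math.sqrt(n_parent + 1) / (1 + child.visits)
                return child.q() + u
            return max(node.children.items(), key=lambda kv: puct(kv[1]))

        # Toy usage: a root with two candidate scheduling decisions.
        root = Node(prior=1.0)
        root.children = {"job_a_first": Node(0.7), "job_b_first": Node(0.3)}
        action, child = select_child(root)
        print(action)  # 'job_a_first': the higher prior wins while both are unvisited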

    Evaluation of Methods for Semantic Segmentation of Endoscopic Images

    We examined multiple semantic segmentation methods that consider the information contained in endoscopic images at different levels of abstraction in order to predict semantic segmentation masks. These segmentations can be used to obtain position information of surgical instruments in endoscopic images, which is the foundation for many computer-assisted systems, such as automatic instrument tracking. The methods in this paper were examined and compared with regard to their accuracy, the effort to create the data set, and inference time. Of all the investigated approaches, the LinkNet34 encoder-decoder network scored best, achieving an Intersection over Union score of 0.838 with an inference time of 30.25 ms on a 640 x 480 pixel input image with an NVIDIA GTX 1070Ti GPU.
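
    For reference, Intersection over Union measures the overlap between a predicted mask and the ground truth. A minimal NumPy sketch of the metric (illustrative only, not the paper's evaluation code):

        import numpy as np

        def iou(pred, target):
            """Intersection over Union for binary masks of 0/1 values."""
            pred, target = pred.astype(bool), target.astype(bool)
            intersection = np.logical_and(pred, target).sum()
            union = np.logical_or(pred, target).sum()
            return intersection / union if union else 1.0  # two empty masks match

        # Toy example: two overlapping 2x2 squares in a 4x4 image.
        a = np.zeros((4, 4), dtype=np.uint8); a[0:2, 0:2] = 1
        b = np.zeros((4, 4), dtype=np.uint8); b[1:3, 1:3] = 1
        print(iou(a, b))  # 1 shared pixel / 7 covered pixels -> ~0.143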

    Registered and Segmented Deformable Object Reconstruction from a Single View Point Cloud

    In deformable object manipulation, we often want to interact with specific segments of an object that are defined only in non-deformed models of the object. We thus require a system that can recognize and locate these segments in sensor data of deformed real-world objects. This is normally done using deformable object registration, which is problem-specific and complex to tune. Recent methods use neural occupancy functions to improve deformable object registration by registering to an object reconstruction. Going one step further, we propose a system that, in addition to reconstruction, learns segmentation of the reconstructed object. As the resulting output already contains the information about the segments, we can skip the registration step. Tested on a variety of deformable objects in simulation and the real world, we demonstrate that our method learns to robustly find these segments. We also introduce a simple sampling algorithm to generate better training data for occupancy learning. Comment: Accepted at WACV 202
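
    The core idea, predicting a segment label alongside occupancy for each 3D query point, can be sketched as a small PyTorch module. Layer sizes, the latent shape code, and all names below are assumptions for illustration, not the paper's architecture:

        import torch
        import torch.nn as nn

        # Sketch: an occupancy decoder that also classifies each query point
        # into an object segment. Layer sizes, the latent shape code, and all
        # names are assumptions, not the paper's architecture.

        class SegmentingOccupancyNet(nn.Module):
            def __init__(self, latent_dim=128, n_segments=4, hidden=256):
                super().__init__()
                self.backbone = nn.Sequential(
                    nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
                    nn.Linear(hidden, hidden), nn.ReLU(),
                )
                self.occupancy = nn.Linear(hidden, 1)          # inside/outside logit
                self.segments = nn.Linear(hidden, n_segments)  # per-point segment logits

            def forward(self, points, latent):
                # points: (B, N, 3) query points; latent: (B, latent_dim) shape code
                latent = latent.unsqueeze(1).expand(-1, points.shape[1], -1)
                h = self.backbone(torch.cat([points, latent], dim=-1))
                return self.occupancy(h).squeeze(-1), self.segments(h)

        net = SegmentingOccupancyNet()
        occ, seg = net(torch.rand(2, 1024, 3), torch.rand(2, 128))
        print(occ.shape, seg.shape)  # (2, 1024) and (2, 1024, 4)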

    A Mechatronic Interface for using Oblique-Viewing Endoscopes with Light Weight Robots for Laparoscopic Surgery

    No mechatronic interface exists between a robotic arm and an oblique-viewing endoscope with rotational actuation and without kinematic singularity. We therefore developed an interface between a Franka Emika Panda manipulator and a rigid 30-degree oblique-viewing endoscope by Storz, which can be used during laparoscopic surgery. The design is easily adaptable to other camera heads and robotic arms. As a next step, we will compare the design to a 30-degree endoscope without utilization of the rotational degree of freedom and to a 0-degree endoscope.

    Collaborative Control for Surgical Robots

    We are designing and evaluating control strategies that enable surgeons to intuitively hand-guide an endoscope attached to a redundant lightweight robot. The strategies focus on safety aspects as well as intuitive and smooth control for moving the endoscope. Two scenarios are addressed: the first is compliant hand-guidance of the endoscope; the second moves the robot's elbow on its redundancy circle to get the robot out of the surgeon's way without changing the view. To prevent collisions with the patient and the environment, the robot must move while respecting Cartesian constraints.
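
    Moving the elbow on its redundancy circle without changing the endoscope view is classically achieved by projecting joint motion into the nullspace of the task Jacobian. A minimal NumPy sketch of that idea (the Jacobian below is random stand-in data, not the robot's kinematics, and nothing here is the project's actual controller):

        import numpy as np

        # Sketch of nullspace motion for a redundant arm: joint velocities that
        # move the elbow while leaving the end effector (the endoscope view)
        # fixed. The Jacobian is random stand-in data for illustration.

        def nullspace_velocity(J, qdot_desired):
            """Project a desired joint motion into the nullspace of the task Jacobian."""
            J_pinv = np.linalg.pinv(J)            # 7x6 pseudoinverse
            N = np.eye(J.shape[1]) - J_pinv @ J   # 7x7 nullspace projector
            return N @ qdot_desired

        rng = np.random.default_rng(0)
        J = rng.standard_normal((6, 7))           # 6D end-effector task, 7 joints
        qdot = nullspace_velocity(J, rng.standard_normal(7))
        print(np.allclose(J @ qdot, 0))           # True: end-effector twist is zero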

    A learning robot for cognitive camera control in minimally invasive surgery

    Background: We demonstrate the first self-learning, context-sensitive, autonomous camera-guiding robot applicable to minimally invasive surgery. The majority of surgical robots today are telemanipulators without autonomous capabilities. Autonomous systems have been developed for laparoscopic camera guidance, but they follow simple rules and do not adapt their behavior to specific tasks, procedures, or surgeons.

    Methods: The methodology presented here allows different robot kinematics to perceive their environment, interpret it according to a knowledge base, and perform context-aware actions. For training, twenty operations were conducted with human camera guidance by a single surgeon. Subsequently, we experimentally evaluated the cognitive robotic camera control. A VIKY EP system and a KUKA LWR 4 robot were first trained on data from manual camera guidance after completion of the surgeon's learning curve. Second, only data from the VIKY EP were used to train the LWR, and finally data from training with the LWR were used to re-train the LWR.

    Results: The duration of each operation decreased with the robot's increasing experience, from 1704 s ± 244 s to 1406 s ± 112 s, and finally 1197 s. Camera guidance quality (good/neutral/poor) improved from 38.6/53.4/7.9% to 49.4/46.3/4.1% and 56.2/41.0/2.8%.

    Conclusions: The cognitive camera robot improved its performance with experience, laying the foundation for a new generation of cognitive surgical robots that adapt to a surgeon's needs.

    Data Set: Evaluation of Domain Randomization Techniques for Transfer Learning - Part 0

    Transfer Learning Data Set: The total data set contains 1.44 million synthetic images and 10k real-world images of size 299x224. Due to file size limitations, the data set is split into two parts. This part (part 0: DOI 10.5281/zenodo.2581311) contains the real-world images and the synthetic images of perspective 00. The second part (part 1: DOI 10.5281/zenodo.2581469) contains perspectives 01 and 10 of the synthetic images.

    1. 10k real-world images. The filename labels the image:
       - the first 6 digits represent the id
       - digit 8 labels the grasp: 0 -> no grasp, 1 -> grasp
       - digit 10 is empty (reserved for a second perspective)
       - digit 12 labels the grasp box: 0 -> green box, 1 -> yellow box
       - digit 14 labels the distractors: 0 -> no distractors, 1 -> distractors
       - a trailing c stands for a color image; d is reserved for depth (not in use)

    2. 1.44 million synthetic images. The folder name labels the enabled randomization technique:
       - the first 2 digits label the perspective: 00 -> standard, 01 -> shake, 10 -> random
       - digit 3 labels the grasp box: 0 -> default green box, 1 -> random box
       - digit 4 labels the distractors: 0 -> no distractors, 1 -> distractors
       - digit 5 labels the lighting: 0 -> default lighting, 1 -> random lighting
       - digit 6 labels the mesh randomization: 0 -> default mesh color, 1 -> random mesh color

       The filename consists of three parts, parsed as in the sketch below:
       - the first 6 digits represent the id
       - digit 8 labels the grasp: 0 -> no grasp, 1 -> grasp
       - digits 10-15 equal the folder name and represent the enabled technique
       - a trailing c stands for a color image; d is reserved for depth (not in use)
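
    A minimal parsing sketch for the synthetic filenames: only the 1-indexed digit positions come from the description above; the underscore separators, file extension, example filename, and helper name are assumptions for illustration and are not shipped with the data set.

        # Sketch of a parser for the synthetic-image filenames described above.
        # Only the digit positions come from the description; separators,
        # extension, and example filename are assumptions.

        def parse_synthetic_filename(name):
            stem = name.rsplit(".", 1)[0]
            return {
                "id": stem[0:6],           # digits 1-6: image id
                "grasp": stem[7] == "1",   # digit 8: grasp flag
                "technique": stem[9:15],   # digits 10-15: folder name / technique
                "channel": stem[-1],       # 'c' color; 'd' reserved for depth
            }

        print(parse_synthetic_filename("000042_1_010101_c.png"))
        # {'id': '000042', 'grasp': True, 'technique': '010101', 'channel': 'c'}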

    Deep learning for semantic segmentation of organs and tissues in laparoscopic surgery

    Semantic segmentation of organs and tissue types is an important sub-problem in image-based scene understanding for laparoscopic surgery and a prerequisite for context-aware assistance and cognitive robotics. Deep learning (DL) approaches are prominently applied to segmentation and tracking of laparoscopic instruments. This work compares different combinations of neural networks, loss functions, and training strategies in their application to semantic segmentation of different organs and tissue types in human laparoscopic images, in order to investigate their applicability as components in cognitive systems. TernausNet-11 trained with the Soft-Jaccard loss and a pretrained, trainable encoder performs best with regard to segmentation quality (78.31% mean Intersection over Union [IoU]) and inference time (28.07 ms) on a single GTX 1070 GPU.
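
    The Soft-Jaccard loss is a differentiable surrogate for IoU computed on predicted probabilities rather than hard masks. A minimal PyTorch sketch of a common formulation (not necessarily the paper's exact variant):

        import torch

        def soft_jaccard_loss(probs, target, eps=1e-7):
            """1 - soft IoU. probs: (B, H, W) sigmoid outputs; target: (B, H, W) in {0, 1}."""
            dims = (1, 2)
            intersection = (probs * target).sum(dims)
            union = probs.sum(dims) + target.sum(dims) - intersection
            return (1.0 - (intersection + eps) / (union + eps)).mean()

        # Sanity check: a perfect prediction yields a loss near 0.
        t = (torch.rand(2, 4, 4) > 0.5).float()
        print(soft_jaccard_loss(t, t))  # tensor(~0.)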